We investigate symbolic sequences with long-range correlations by means of computational
simulation. We analyze sequences with two, three and four symbols in which each symbol may be
repeated $l$ times, with the probability distribution $p(l) \propto 1/l^{\mu}$. For these
sequences, we verify that the usual entropy increases more slowly when the symbols are correlated
and that the Tsallis entropy exhibits, for a suitable choice of $q$, a linear behavior. We also
study the chain as a random walk-like process and observe a nonusual diffusive behavior that
depends on the value of the parameter $\mu$.
1. INTRODUCTION

Two basic assumptions of statistical mechanics are the "equal a priori probabilities"
postulate and ergodicity. When these assumptions do not hold, we need other suitable
tools to study systems which exhibit a nonusual behavior. A typical situation is found
in systems which have an intermediate regime between periodic and chaotic [1]. This
kind of system commonly shows a power-law spectrum and appears in several fields of
science. Aspects of nonusual behavior have been explored, for instance, in biology [2],
nuclear physics [3], financial markets [4], music [5] and linguistics [6]. In this
context, there are also works that search for correlations in DNA sequences [7]-[11]
by using entropic indexes. To provide a possible description for these systems, which
are not conveniently explained by the usual formalism, Tsallis proposed an extension
of the Boltzmann-Gibbs entropy. Many systems have been investigated by using this
approach, e.g., long-range Hamiltonian systems like the generalized Lennard-Jones
gas [18], the HMF model [19], self-gravitating systems [20] and anomalous
diffusion [21].



In this direction, to clarify in a more direct way the basic aspects related to the
Tsallis entropy, it may be convenient to consider specific models with a kind of
long-range behavior. Accordingly, the aim of this work is to explore the nonusual
behavior of a symbolic model with an adjustable long-range behavior. More precisely,
we investigate one-dimensional symbolic sequences with long-range correlations,
generated by the numerical experiment presented in Ref. [22]. The procedure uses two
random numbers to obtain a lattice with $N$ sites which represents the symbolic
sequence. One of them, $x$, has a uniform distribution in the interval [0,1], and the
other, $y$, emerges from the expression

$$ y = A\left[\frac{1}{(1-x)^{1/(\mu-1)}} - 1\right], \qquad (1) $$

where $A$ and $\mu$ are real parameters. We go through the symbolic sequence drawing
$x$ and filling $N_y = [y] + 1$ sites with the same value $z$, where $[y]$ denotes the
integer part of $y$ and $z$ is a signal generator that can take one of four distinct
values (0, 1, 2, 3) with the same statistical weight. A typical example obtained with
this procedure is



$$ Q = \{\underbrace{0,0,0}_{N_y=3},\underbrace{1,1,1,1,1}_{N_y=5},\underbrace{0,0}_{N_y=2},\underbrace{1,1,1,1}_{N_y=4},\underbrace{0}_{N_y=1}\} $$

for a sequence with two symbols.
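
The generation procedure can be sketched in a few lines of Python. This is only a minimal illustration, not the code used to produce the results: it assumes that Eq. (1) has the form $y = A[(1-x)^{-1/(\mu-1)} - 1]$, and the function names `draw_run_length` and `generate_sequence`, the `seed` argument and the truncation of the last run at $N$ sites are choices made here for the example.

```python
import random


def draw_run_length(mu, A, rng):
    """Draw N_y = [y] + 1, with y obtained from a uniform x (assumed form of Eq. 1)."""
    if mu <= 1.0:
        raise ValueError("mu must be greater than 1")
    x = rng.random()  # uniform in [0, 1), so 1 - x is never zero
    y = A * ((1.0 - x) ** (-1.0 / (mu - 1.0)) - 1.0)
    return int(y) + 1  # integer part of y, plus one


def generate_sequence(n_sites, mu, A=2.0, n_symbols=2, seed=None):
    """Fill n_sites lattice sites with runs of equally likely symbols."""
    rng = random.Random(seed)
    sequence = []
    while len(sequence) < n_sites:
        z = rng.randrange(n_symbols)  # symbol drawn with equal weight
        # Cap the run so that a very long run cannot exceed the lattice size.
        run = min(draw_run_length(mu, A, rng), n_sites - len(sequence))
        sequence.extend([z] * run)
    return sequence


# Hypothetical usage: a short two-symbol sequence with mu = 2.5 and A = 2.
example = generate_sequence(20, mu=2.5, A=2.0, n_symbols=2, seed=1)
```

Smaller values of `mu` should produce longer runs of the same symbol, which is the mechanism that introduces the long-range correlations.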

For the sequences generated with the procedure described above, we may obtain the
probability distribution function of the variable $y$, $p(y)$, and show that, depending
on the value of $\mu$, it can be asymptotically related to a Lévy distribution (for
$y > 0$). In fact, after some calculations, one can show that $p(y)$ is given by

$$ p(y) = (\mu - 1)\,\frac{A^{\mu-1}}{(A+y)^{\mu}}, \qquad (2) $$

and the first moment of this distribution is $\langle y \rangle = A/(\mu - 2)$. By
comparing the asymptotic limit of Eq. (2), $p(y) \sim 1/y^{\mu}$, with the asymptotic
limit of the Lévy distributions, $p(y) \sim 1/y^{1+\eta}$, the relation between $\mu$
and $\eta$ is $\mu = 1 + \eta$. Note also that $\langle y \rangle$ diverges for
$\mu \to 2$. This fact indicates that, when $\mu$ is close to two, $N_y$ may assume
large values and fill a large part of the symbolic sequence with the same symbol. On
the other hand, when $\mu$ is far from two ($\mu \gg 2$), large values of $N_y$ become
very rare and consequently the sequence has more alternated symbols.
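
The calculation behind the distribution of $y$ and its first moment is not spelled out in the text. One possible route is sketched below, under the assumption that Eq. (1) reads $y = A[(1-x)^{-1/(\mu-1)} - 1]$ with $x$ uniform on [0,1] and $\mu > 1$; the use of the survival function $P(Y > y)$ is a choice made here for brevity.

```latex
\begin{align*}
P(Y > y) &= P\!\left( 1 - x < \left(\frac{A}{A+y}\right)^{\mu-1} \right)
          = \left(\frac{A}{A+y}\right)^{\mu-1}, \\
p(y) &= -\frac{d}{dy}\,P(Y > y)
      = (\mu-1)\,\frac{A^{\mu-1}}{(A+y)^{\mu}}, \\
\langle y \rangle &= \int_0^{\infty} P(Y > y)\,dy
      = A^{\mu-1}\int_0^{\infty} (A+y)^{-(\mu-1)}\,dy
      = \frac{A}{\mu-2} \qquad (\mu > 2).
\end{align*}
```

The last integral converges only for $\mu > 2$, which is consistent with the divergence of $\langle y \rangle$ as $\mu \to 2$.
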

FIG. 1: $S_q$ versus $L$ for some values of $\mu$ (indicated in the figure) for a two-symbol sequence. We use $A = 2$ and N = 10 s in
all three figures.



2. ENTROPY AND SEQUENCE 

The Tsallis entropy is defined, for a system with $W$ microstates and occupation
probabilities $p_i$, as follows:

$$ S_q = \frac{1 - \sum_{i=1}^{W} p_i^{q}}{q - 1}, \qquad (3) $$

where $q$ is a real parameter. In the limit $q \to 1$ we recover the standard
Boltzmann-Gibbs entropy. $S_q$ is extensive for a composite system consisting of
independent subsystems for $q = 1$ and nonextensive for $q \neq 1$; for this reason,
$S_q$ is sometimes referred to as nonextensive entropy. However, when we have
long-range interactions or long-range correlations, the subsystems cannot be
independent. In this case we will see that $S_q$ can be extensive for a particular
value of $q \neq 1$.
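
As a small numerical illustration of Eq. (3), the sketch below evaluates $S_q$ for a list of probabilities and falls back to the Boltzmann-Gibbs form when $q$ is numerically equal to one. The function name `tsallis_entropy` and the tolerance `tol` are choices made for this example.

```python
import math


def tsallis_entropy(probabilities, q, tol=1e-9):
    """Tsallis entropy S_q = (1 - sum_i p_i^q) / (q - 1) of a probability list."""
    probs = [p for p in probabilities if p > 0.0]  # zero entries do not contribute
    if abs(q - 1.0) < tol:
        # Limit q -> 1: Boltzmann-Gibbs (Shannon) entropy with natural logarithm.
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


# Two independent fair coins: single-coin and joint distributions.
single = [0.5, 0.5]
joint = [0.25, 0.25, 0.25, 0.25]
s2_single = tsallis_entropy(single, 2.0)  # 0.5
s2_joint = tsallis_entropy(joint, 2.0)    # 0.75
```

For $q = 2$ the joint entropy (0.75) is smaller than the sum of the two single-coin entropies (1.0), whereas for $q = 1$ the joint value equals the sum; this is the nonextensivity for independent subsystems mentioned above.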

In order to evaluate the Tsallis entropy for the symbolic sequence generated with the
previous procedure, we fix windows of length $L$ which are moved along the sequence.
Then, we count how many times a given configuration (string) occurs, determining the
probability $p_i$ of a specific configuration $i$. To illustrate this procedure,
suppose that we have the following sequence:

$$ Q = \{0,0,1,0,0,1,1,1,0,0,0,1,0,0,0,0,1,0,1,1\}, $$

then we fix a window of length 2 and move it along the sequence, i.e., we have

$$ Q = \{\underbrace{0,0}_{1},\underbrace{1,0}_{2},\underbrace{0,1}_{3},\underbrace{1,1}_{4},\underbrace{0,0}_{5},\underbrace{0,1}_{6},\underbrace{0,0}_{7},\underbrace{0,0}_{8},\underbrace{1,0}_{9},\underbrace{1,1}_{10}\}, $$

where the index below each brace indicates the time step of the window's motion. The
next step is to count how many times a given configuration occurs; for example, the
configuration {0, 0} occurs 4 times (at the instants of "time" 1, 5, 7 and 8), leading
to the probability 4/10. Similarly, we calculate the probability of the other
configurations, and repeat the calculation for other window lengths.
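
The counting step of the worked example can be reproduced with the following sketch. It assumes, as in the example, that the window advances by its own length (10 positions for 20 symbols); the function name `block_probabilities` is introduced here for illustration.

```python
from collections import Counter


def block_probabilities(sequence, window):
    """Relative frequency of each configuration seen by a non-overlapping window."""
    n_windows = len(sequence) // window  # incomplete trailing symbols are ignored
    blocks = [
        tuple(sequence[k * window:(k + 1) * window]) for k in range(n_windows)
    ]
    counts = Counter(blocks)
    return {block: count / n_windows for block, count in counts.items()}


# The 20-symbol sequence of the worked example, read with a window of length 2.
Q = [0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1]
probs = block_probabilities(Q, 2)
# probs[(0, 0)] is 0.4, i.e. the 4/10 quoted in the text
```

The resulting probabilities can be fed directly into Eq. (3) to obtain $S_q$ for a given window length.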

Figure (1) shows $S_q$ as a function of $L$ for some values of $\mu$. Note that for
each value of $\mu$ there is only one value of $q = q^*$ that makes the relation $S_q$
versus $L$ linear. This feature becomes evident when we look at the linear
correlations (see the insets in Fig. (1)). We can observe from the above results that
when $\mu$ decreases $q^*$ also decreases.
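
One way to locate $q^*$ numerically is to scan a grid of $q$ values and keep the one for which $S_q$ versus $L$ has the largest linear correlation coefficient. The text does not state how $q^*$ was actually selected, so this criterion, the grid, and the names `pearson_r` and `find_q_star` are assumptions of the sketch below.

```python
import math


def tsallis_entropy(probabilities, q, tol=1e-9):
    """Tsallis entropy of a probability list (Boltzmann-Gibbs form when q is 1)."""
    probs = [p for p in probabilities if p > 0.0]
    if abs(q - 1.0) < tol:
        return -sum(p * math.log(p) for p in probs)
    return (1.0 - sum(p ** q for p in probs)) / (q - 1.0)


def pearson_r(xs, ys):
    """Linear (Pearson) correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def find_q_star(probs_by_length, q_grid):
    """Return (q, r) for the grid value of q that makes S_q versus L most linear."""
    lengths = sorted(probs_by_length)
    best_q, best_r = None, -1.0
    for q in q_grid:
        entropies = [tsallis_entropy(probs_by_length[L], q) for L in lengths]
        r = pearson_r(lengths, entropies)
        if r > best_r:
            best_q, best_r = q, r
    return best_q, best_r


# Check on an uncorrelated fair two-symbol source, where all 2^L blocks are
# equally likely and the Boltzmann-Gibbs entropy grows linearly with L.
uniform_blocks = {L: [2.0 ** -L] * (2 ** L) for L in range(1, 9)}
q_grid = [0.5 + 0.1 * k for k in range(11)]
q_star, r_best = find_q_star(uniform_blocks, q_grid)
```

For the uncorrelated source the scan selects $q = 1$, as expected; for correlated sequences the block probabilities would instead come from the window-counting procedure described above.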

Motivated by the previous results, we investigate the relation $q^*$ versus $\mu$ for
two, three and four-symbol sequences. The results are shown in Fig. (2). Note that,
when $\mu$ increases, $q^*$ tends to unity, and that the more symbols the sequence
has, the faster $q^*$ approaches one. This feature shows that large values of $\mu$
generate small values of $N_y$ and consequently the terms of the symbolic sequence
become uncorrelated, leading to the usual description based on the Boltzmann-Gibbs
entropy. However, when $\mu$ decreases, $N_y$ is generally very large (remember that,
when $\mu < 2$, all the moments of $p(y)$ diverge) and introduces correlations among
the terms of the symbolic sequence which are not properly described by the usual
formalism. The decreasing values of $q^*$ reflect this nonusual behavior. We emphasize
that in this case the Tsallis entropy is extensive and the Boltzmann-Gibbs entropy is
not, indicating the applicability and robustness of the generalized entropy.

FIG. 2: The entropic index $q^*$ versus $\mu$ for two, three and
four-symbol sequences, with $A = 2$ and N = 10 s .

FIG. 3: The standard deviation versus $N$ for $\mu = 2.2$ and
$A = 0.1$.



3. DIFFUSION AND SEQUENCE 

In order to explore further aspects of a symbolic sequence, let us consider it as an
erratic trajectory and establish a correspondence with a diffusive process. For the
case of two-symbol sequences, we associate the symbol "0" with a jump of unit length
to the right and the symbol "1" with a jump of unit length to the left, that is, a
random walk-like process.

Using the previous prescription, we calculate the standard deviation for $i = 1$ to
$N$ over $10^5$ events, as we can see in Fig. (3). We know that the slope $\alpha$ of
this curve is one for a usual diffusion, but in the case of Fig. (3) $\alpha$ is
greater than one. We also observed that $\alpha$ depends on $\mu$. This behavior is
shown in Fig. (4a). Note that for small values of $\mu$ ($\mu < 3$) the diffusion is
anomalous, i.e., we have a superdiffusion, and for large values ($\mu > 3$) the
diffusion regime tends to a usual diffusion. This behavior can be explained if we
remember that for small values of $\mu$, $N_y$ can be very large and consequently the
walker can make large steps without changing direction. When $\mu$ is large, this
event becomes very rare because $N_y$ is in general small, making the walker change
direction and producing a usual erratic trajectory. We may also connect $\alpha$ with
$q^*$ through the values of $\mu$. In order to do this we evaluate the relation $q^*$
versus $\mu$, as shown in Fig. (4b), and exhibit $q^*$ versus $\alpha$ in Fig. (4c).
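
The mapping from symbols to a walk and the estimate of the exponent $\alpha$ can be sketched as follows. The text does not say exactly which quantity is fitted, so the sketch assumes that $\alpha$ is defined through the ensemble variance, $\sigma^2 \propto i^{\alpha}$, which gives $\alpha = 1$ for ordinary diffusion as stated above; the function names and the least-squares fit on log-log axes are likewise choices made here.

```python
import math
import random


def symbols_to_steps(sequence):
    """Map symbol 0 to a unit jump to the right (+1) and symbol 1 to the left (-1)."""
    return [1 if symbol == 0 else -1 for symbol in sequence]


def position_variance(walks):
    """Ensemble variance of the walker position after each step i = 1..N."""
    n_steps = len(walks[0])
    sums = [0.0] * n_steps
    sums_sq = [0.0] * n_steps
    for steps in walks:
        position = 0
        for i, step in enumerate(steps):
            position += step
            sums[i] += position
            sums_sq[i] += position * position
    m = len(walks)
    return [sums_sq[i] / m - (sums[i] / m) ** 2 for i in range(n_steps)]


def diffusion_exponent(walks):
    """Least-squares slope of log(variance) versus log(i), skipping zero variances."""
    variance = position_variance(walks)
    points = [
        (math.log(i + 1), math.log(v)) for i, v in enumerate(variance) if v > 0.0
    ]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Reference case: uncorrelated two-symbol sequences (an ordinary random walk).
rng = random.Random(0)
walks = [
    symbols_to_steps([rng.randrange(2) for _ in range(200)]) for _ in range(3000)
]
alpha = diffusion_exponent(walks)
```

For uncorrelated symbols the estimate should be close to one; sequences generated with small $\mu$ would be expected to give larger values, in line with the superdiffusive regime described above.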



4. DISCUSSION AND CONCLUSION 

We verified that by varying the value of $\mu$ we can produce long-range correlations
in symbolic sequences. This is evidenced by the nonlinear growth of the
Boltzmann-Gibbs entropy. This feature led us to use the Tsallis entropy with suitable
values of $q$ to obtain a satisfactory description of these sequences. Specifically,
we observed that the Tsallis entropy preserves extensivity even when the terms of the
symbolic sequence are correlated.

FIG. 4: (a) The slope $\alpha$ of the curve in Fig. (3) versus $\mu$, (b) the entropic
index $q^*$ versus $\mu$ and (c) $q^*$ versus $\alpha$. In the three figures we use
$A = 0.1$.



We also considered the symbolic sequence as a random walk-like process and evaluated
the standard deviation. The result showed that the diffusive process presents a
superdiffusive regime which emerges for small values of $\mu$ ($\mu < 3$). The usual
diffusion is recovered when $\mu > 3$.